A Genetic Algorithm for Simplifying the Amino Acid Alphabet in Bioinformatics Applications
Authors
Abstract
Simplified amino acid alphabets have been successful in several areas of bioinformatics, including protein structure prediction, protein function prediction, and protein classification. Because the number of possible simplifications is large, it is not practical to search through all of them to find one suitable for a specific application. A previous study by the authors indicated that algorithms relying heavily on randomness tend to produce poor simplifications. Genetic algorithms have generally been successful in producing quality solutions to problems with a large solution space, but their reliance on randomness makes it difficult to create quality simplifications. The goal of this study is to overcome these difficulties and create a genetic simplification algorithm. The presented results include the genetic simplification algorithm itself, as well as the difficulties encountered in creating it. The described algorithm has been implemented as a computer program that uses a genetic algorithm to produce simplified alphabets; its outputs are listed and analyzed.
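To make the size of the search space concrete, the following minimal Python sketch assumes a simplification is any partition of the 20 standard amino acids into groups, with each group collapsed to a single symbol. Under that assumption the number of candidates is the Bell number B(20) = 51,724,158,235,372. The function names and the example grouping are hypothetical illustrations, not taken from the paper, and the paper's own definition of a simplification may be more restrictive.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def bell_number(n: int) -> int:
    """Count the partitions of an n-element set using the Bell triangle."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return row[0]


def build_mapping(groups: List[str]) -> Dict[str, str]:
    """Map every residue to the first letter of its group (the representative)."""
    mapping: Dict[str, str] = {}
    for group in groups:
        representative = group[0]
        for residue in group:
            mapping[residue] = representative
    return mapping


def simplify(sequence: str, groups: List[str]) -> str:
    """Rewrite a protein sequence in the simplified alphabet.

    Residues not covered by any group are left unchanged.
    """
    mapping = build_mapping(groups)
    return "".join(mapping.get(residue, residue) for residue in sequence)


# Hypothetical grouping, chosen only to illustrate the data structure.
EXAMPLE_GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

if __name__ == "__main__":
    print(bell_number(len(AMINO_ACIDS)))
    print(simplify("MKVLDE", EXAMPLE_GROUPS))
```

Representing a candidate as a list of groups is only one possible encoding. A genetic algorithm would additionally need a fitness function and crossover and mutation operators over such partitions, which the abstract does not specify.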
Similar resources
Simplifying amino acid alphabets by means of a branch and bound algorithm and substitution matrices
MOTIVATION Protein and DNA are generally represented by sequences of letters. In a number of circumstances simplified alphabets (where one or more letters would be represented by the same symbol) have proved their potential utility in several fields of bioinformatics including searching for patterns occurring at an unexpected rate, studying protein folding and finding consensus sequences in mul...
Comparing the Bidirectional Baum-Welch Algorithm and the Baum-Welch Algorithm on Regular Lattice
A profile hidden Markov model (PHMM) is widely used in assigning protein sequences to protein families. In this model, the hidden states only depend on the previous hidden state and observations are independent given hidden states. In other words, in the PHMM, only the information of the left side of a hidden state is considered. However, it makes sense that considering the information of the b...
Optimizing amino acid groupings for GPCR classification
MOTIVATION There is much interest in reducing the complexity inherent in the representation of the 20 standard amino acids within bioinformatics algorithms by developing a so-called reduced alphabet. Although there is no universally applicable residue grouping, there are numerous physiochemical criteria upon which one can base groupings. Local descriptors are a form of alignment-free analysis, ...
Automated Alphabet Reduction Method with Evolutionary Algorithms for Protein Structure Prediction Biological Applications Track
This paper focuses on automated procedures to reduce the dimensionality of protein structure prediction datasets by simplifying the way in which the primary sequence of a protein is represented. The potential benefits of this procedure are faster and easier learning process and generation of more compact and human-readable solutions. This simplification consists of an alphabet reduction procedu...
Sequential and Mixed Genetic Algorithm and Learning Automata (SGALA, MGALA) for Feature Selection in QSAR
Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as: GA, PSO, ACO, SA and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR f...
Journal title:
Volume, issue:
Pages: -
Publication date: 2003